Detecting Underspecification in SNOMED CT Concept Definitions Through Natural Language Processing

Authors

  • Edson José Pacheco
  • Holger Stenzhorn
  • Percy Nohama
  • Jan Paetzold
  • Stefan Schulz
Abstract

Quality assurance and auditing play a major role in maintaining large biomedical terminologies such as SNOMED CT. Several automated techniques have been proposed to facilitate the identification of weak spots and to suggest adequate improvements. In this study, we address a well-known issue within SNOMED CT: although the wording of many free-text concept descriptions suggests a connection to other concepts, these concepts are often not referred to in the logical concept definition. To detect such inconsistencies, we use a semantic indexing approach that maps free text onto a sequence of semantic identifiers. Applied to SNOMED CT concepts without attributes, our technique spots refinable concepts and suggests appropriate attributes, i.e., connections to other concepts. Based on a manual analysis of random samples, we estimate that approximately 18,000 refinable concepts can be found.
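
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon, the identifiers (such as `#hepat`), the concept IDs, and the subset-matching rule are all hypothetical, chosen only to show how mapping free text onto semantic identifiers could flag attribute-less concepts whose wording mentions another concept.

```python
import re

# Hypothetical toy lexicon: word -> semantic identifier. Neither the words
# nor the identifiers are taken from the paper or from SNOMED CT.
LEXICON = {
    "fracture": "#fractur",
    "femur": "#femur",
    "biopsy": "#biopsy",
    "liver": "#hepat",
    "hepatic": "#hepat",  # synonyms share one identifier
}

# Hypothetical concepts: id -> (free-text description, defining attributes).
CONCEPTS = {
    "C1": ("Femur", []),
    "C2": ("Liver", []),
    "C3": ("Fracture of femur", []),
    "C4": ("Biopsy of liver", [("procedure site", "C2")]),
    "C5": ("Hepatic biopsy", []),
}


def semantic_index(text):
    """Map free text onto a sequence of semantic identifiers.

    Words missing from the lexicon (e.g. "of") are skipped.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEXICON[t] for t in tokens if t in LEXICON]


def suggest_refinements(concepts):
    """For each concept without attributes, list other concepts whose
    identifiers form a proper subset of its own identifiers."""
    indexed = {cid: semantic_index(desc) for cid, (desc, _) in concepts.items()}
    suggestions = {}
    for cid, (_, attributes) in concepts.items():
        if attributes:
            continue  # only concepts without attributes are examined
        own = set(indexed[cid])
        candidates = sorted(
            other
            for other, ids in indexed.items()
            if other != cid and ids and set(ids) < own
        )
        if candidates:
            suggestions[cid] = candidates
    return suggestions


if __name__ == "__main__":
    for cid, targets in suggest_refinements(CONCEPTS).items():
        print(cid, "may be refinable with a link to", targets)
```

In this toy data, "Hepatic biopsy" is matched to "Liver" only because both words share one identifier, which is the kind of normalization a semantic indexing step is meant to provide. Which attribute type a suggested link should carry is left open here; in the study, candidates were assessed by manual analysis of random samples.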

Similar resources

Identifying Potentially Missing Hierarchical Relations in SNOMED CT based on Lexical Features - Impact of Synonyms and Lexico-syntactic Constraints

Introduction: The quality assurance of large bio-ontologies is extremely critical for their effective and continued use and is an active area of research [1]. For example, recent investigations highlighted issues in the hierarchical structure of SNOMED CT and their detrimental effects on biomedical applications [2]. Previous work by one of the authors [3] established a method to identify potentially missin...

Learning Formal Definitions for Snomed CT from Text

Snomed CT is a widely used medical ontology which is formally expressed in a fragment of the Description Logic EL++. The underlying logics allow for expressive querying, yet make it costly to maintain and extend the ontology. Existing approaches for ontology generation mostly focus on learning superclass or subclass relations and therefore fail to be used to generate Snomed CT definitions. In t...

OntoVerbal-M: a Multilingual Verbaliser for SNOMED CT

OntoVerbal-M is an ontology verbaliser that transforms OWL into fluent natural language paragraphs in multiple languages. We describe the application of OntoVerbal-M to SNOMED CT, whereby SNOMED CT classes are presented as textual paragraphs in both English and Mandarin through the use of natural language generation. SNOMED CT is a large description logic based terminology for recording in elec...

Semantic Tagging of Medical Narratives with Top Level Concepts from SNOMED CT Healthcare Data Standard

Medical narratives written by clinicians constitute critical information in the healthcare domain and are required to be correct with respect to contextual meaning. SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) is a standardized reference terminology that consists of 390023 SNOMED CT concepts with SNOMED CT codes. This paper describes the extraction of SNOMED CT concepts from fr...

A Comparative Study of the Evolution and Structure of the SNOMED Systematized Medical Nomenclature Systems in the United States, the United Kingdom, and Australia, 86-85

Background and Aim: Systematized Nomenclature of Medicine systems provide important support for electronic health records in the registration and retrieval of data. Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) is the most comprehensive language and then the consistency of exchanged data across health care providers and finally the high effectiveness of health care. Material...

Journal title:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

Volume 2009

Publication date 2009